ABSTRACT Cryptococcus neoformans is the leading cause of fungal meningitis worldwide. Previous studies have characterized the
cryptococcal transcriptome under various stress conditions, but a comprehensive profile of the C. neoformans transcriptome in 
the human host has not been attempted. Here, we extracted RNA from yeast cells taken directly from the cerebrospinal fluid 
(CSF) of two AIDS patients with cryptococcal meningitis prior to antifungal therapy. The patients were infected with strains of 
C. neoformans var. grubii of molecular type VNI and VNII. Using RNA-seq, we compared the transcriptional profiles of these 
strains under three environmental conditions (in vivo CSF, ex vivo CSF, and yeast extract-peptone-dextrose [YPD]). Although 
we identified a number of differentially expressed genes, single nucleotide variants, and novel genes that were unique to each
strain, the overall expression patterns of the two strains were similar under the same environmental conditions. Specifically,
yeast cells obtained directly from each patient's CSF were more metabolically active than cells that were incubated ex vivo in
CSF. Compared with growth in YPD, some genes were identified as significantly upregulated in both in vivo and ex vivo CSF,
and they were associated with genes previously recognized for contributing to pathogenicity. For example, genes with known
stress response functions, such as RIM101, ENA1, and CFO1, were regulated similarly in the two clinical strains. Conversely,
many genes that were differentially regulated between the two strains appeared to be transporters. These findings establish a 
platform for further studies of how this yeast survives and produces disease. 

IMPORTANCE Cryptococcus neoformans, an environmental, opportunistic yeast, is annually responsible for an estimated million
cases of meningitis and over 600,000 deaths, mostly among HIV-infected patients in sub-Saharan Africa and Asia. Using RNA- 
seq, we analyzed the gene expression of two strains of C. neoformans obtained from the cerebrospinal fluid (CSF) of infected 
patients, thus creating a comprehensive snapshot of the yeasts' genetic responses within the human body. By comparing the gene 
expression of each clinical strain under three conditions (in vivo CSF, ex vivo CSF, and laboratory culture), we identified genes 
and pathways that were uniquely regulated by exposure to CSF and likely crucial for the survival of C. neoformans in the central 
nervous system. Further analyses revealed genetic diversity between the strains, providing evidence for cryptococcal evolution 
and strain specificity. This ability to characterize transcription in vivo enables the elucidation of specific genetic responses that 
promote disease production and progression. 



Received 17 December 2013 Accepted 23 December 2013 Published 4 February 2014 

Citation Chen Y, Toffaletti DL, Tenor JL, Litvintseva AP, Fang C, Mitchell TG, McDonald TR, Nielsen K, Boulware DR, Bicanic T, Perfect JR. 2014. The Cryptococcus neoformans
transcriptome at the site of human meningitis. mBio 5(1):e01087-13. doi:10.1128/mBio.01087-13.
Editor Françoise Dromer, Institut Pasteur

Copyright © 2014 Chen et al. This is an open-access article distributed under the terms of the Creative Commons Attribution-Noncommercial-ShareAlike 3.0 Unported license,
which permits unrestricted noncommercial use, distribution, and reproduction in any medium, provided the original author and source are credited. 
Address correspondence to John R. Perfect, perfe001@mc.duke.edu. 



Cryptococcus neoformans is an environmental, encapsulated 
yeast and major opportunistic, neurotropic pathogen. Pa-
tients with low levels of CD4+ lymphocytes are particularly sus-
ceptible. In sub-Saharan Africa, the epicenter of the AIDS pan-
demic, C. neoformans is annually responsible for an estimated 
million cases of meningoencephalitis and approximately 600,000 
deaths (1). Over the past 30 years, many molecular and pheno- 
typic studies have identified a cohort of C. neoformans genes that 
clearly enhance but are not necessarily sufficient for virulence, 
such as the capsular polysaccharide, the ability to grow at 37°C, 
and the production of melanin, urease, phospholipase, and other 
factors (2). With the availability of genomic sequences, more re-
cent studies have begun to analyze the transcriptome of C. neofor-
mans under conditions that pertain to its pathogenicity (3-7). 
Using well-characterized laboratory strains of C. neoformans, re- 
ports have documented the transcriptional responses to high- 
temperature (8, 9), nitric oxide (9), iron (10), capsule-inducing 
conditions (11, 12), antifungal drugs (13), and survival within 
macrophages (14) and murine lungs (15). However, the transcrip- 
tional responses to stresses are dynamic and react to a variety of 
signals. To investigate the signals, transcription factors, and genes 
that enable C. neoformans to cause disease, it is critical to identify
the genes that are transcribed by C. neoformans in the central ner-
vous system (CNS). In addition, since most molecular transcrip-
tional studies to date have focused only on a few laboratory strains,
such as H99, it is critical to investigate the transcriptional re-
sponses of other wild-type strains (16).

TABLE 1 Number of RNA-seq reads obtained for each sample and mapping results

                      G0                                        HC1
Characteristic        YPD         Ex vivo CSF  In vivo CSF      YPD         Ex vivo CSF  In vivo CSF
Sequencing type       Single end  Paired end   Single end       Single end  Paired end   Single end
Read length           50          100          50               36          100          36
Total no. of reads    96,331,255  46,748,370   74,481,942       41,283,763  52,103,334   41,270,151
No. of mapped reads   92,513,698  41,423,212   70,860,588       39,989,804  36,610,869   35,340,075
Mapping (%)           96.04       88.61        95.13            96.87       70.27        85.63

The ability to investigate the genetic responses of a pathogenic 
microbe within its host offers a powerful opportunity to elucidate 
the adaptive strategies that are essential for the microbe to survive 
the hostile host environment. We propose that the gene expres- 
sion profiles for yeasts in the host are both site and time specific. 
For instance, we hypothesize that human cryptococcal meningitis 
involves at least six stages: (i) initiation of infection in the lungs
following the inhalation of yeasts or spores; (ii) yeast survival and 
proliferation within the lung; (iii) dormancy of yeast cells in the 
host tissue; (iv) reactivation of latent infection with renewed yeast 
growth; (v) dissemination of the yeasts via the blood, reticuloen- 
dothelial, and lymphatic systems, bridging the blood-brain bar- 
rier; and (vi) proliferation of yeasts in brain tissue and the sub- 
arachnoid space. At these various sites and durations of infection, 
the transcriptional responses of C. neoformans will vary in re-
sponse to the unique host environment.

With this understanding, we have taken a very focused ap- 
proach to identify regulated genes, networks, and signature mark- 
ers that enable C. neoformans to survive and develop disease 
within the subarachnoid space. Using an experimental immuno- 
compromised rabbit model of cryptococcal meningitis and the 
well-studied H99 strain of C. neoformans, we previously identified 
genes that were highly upregulated in the subarachnoid space of 
the CNS, such as isocitrate lyase (ICL1). Nevertheless, additional
studies determined that despite its upregulation, ICL1 was not
essential for disease production at this site (17). Conversely, a gene
in the trehalose pathway, the trehalose-6-phosphate synthase gene
(TPS1), was also significantly upregulated in the CNS (18), and
this pathway was shown to be critical for disease production (19, 
20). 

In this study, we examined and compared the cryptococcal 
transcriptome from two HIV-infected patients with high burdens 
of cryptococcal cells in the cerebrospinal fluid (CSF) at a single 
time point during infection. The patients resided in different geo- 
graphical locations (Uganda and United States), and their strains 
represented the two global molecular types, VNI and VNII, of the 
dominant variety, C. neoformans var. grubii. We extracted RNA 
from the yeast cells taken directly from CSF and used Illumina-
based RNA-seq technology to analyze their transcriptomes. These
in vivo transcriptomes were compared with each other and with
the transcriptomes of each strain after incubation in pooled hu-
man CSF (ex vivo) or in vitro growth in YPD broth. These condi-
tions replicated simple exposure to CSF and late-logarithmic- 
phase growth in nutritionally replete medium. 



RESULTS 

Analysis of two clinical C. neoformans var. grubii isolates. Iso- 
lates of C. neoformans var. grubii were obtained from two un- 
treated patients with cryptococcal infections and AIDS. Strain G0
was obtained from a patient in Uganda participating in the COAT
trial (http://clinicaltrials.gov/ct2/show/nct01075152), and strain
HC1 was isolated from a patient in the United States. Both strains
possess the capsular A serotype and the a mating-type allele. Eight 
unlinked multilocus sequence typing (MLST) loci (21) were used 
to determine molecular genotypes of the two strains by compari- 
son with 30 representative strains from the global population of 
C. neoformans (see Table S1 in the supplemental material). Previ-
ously, we demonstrated that the global population of C. neofor-
mans var. grubii strains can be categorized within one of three
genetically isolated subpopulations or molecular types, identified
as VNI, VNII, and VNB (22). Maximum parsimony analysis de-
termined that the G0 strain has molecular type VNI, and the HC1
strain is molecular type VNII (see Fig. S1) (16, 21, 22). More
specifically, strain G0 has the same genotype as the globally prev-
alent reference VNI strain UG2471 (see Fig. S1) and the MLST-
designated genotype M3b (see Table S1) (16). The MLST geno-
type of strain HC1 matches that of the VNII reference strain C45
(see Fig. S1), which was previously designated genotype M7c (see
Table S1) (16). These two genotypes are globally distributed: other
VNI isolates with the M3b genotype have been reported from 
clinical cases in Belgium, Brazil, Botswana, India, South Africa, 
and Uganda, and isolates of VNII with the M7c genotype have 
been found in South Africa and the United States (16, 21, 22). 

Six cDNA libraries were created for high-throughput Illumina
sequencing. Two libraries were made from yeast cells that were 
directly isolated from the two patients. The remaining four librar- 
ies were made from cultures obtained by inoculating the crypto- 
coccal isolates in either YPD or ex vivo CSF (see Materials and 
Methods). In total, more than 350 million reads were generated, 
representing on average more than 100X coverage of the C. neo-
formans genome length for each sample (Table 1).
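
The read totals and mapping rates summarized in Table 1 can be rechecked with a few lines of arithmetic. The sketch below is only an illustration: the dictionary layout and the function name `mapping_percent` are assumptions made here, and the numbers are transcribed from Table 1. Under those assumptions, the recomputed mapping percentages agree with the reported ones to within about 0.01 percentage points, and the six libraries together exceed 350 million reads.

```python
# Values transcribed from Table 1:
# (strain, condition) -> (total reads, mapped reads, reported mapping %).
TABLE_1 = {
    ("G0", "YPD"): (96_331_255, 92_513_698, 96.04),
    ("G0", "ex vivo CSF"): (46_748_370, 41_423_212, 88.61),
    ("G0", "in vivo CSF"): (74_481_942, 70_860_588, 95.13),
    ("HC1", "YPD"): (41_283_763, 39_989_804, 96.87),
    ("HC1", "ex vivo CSF"): (52_103_334, 36_610_869, 70.27),
    ("HC1", "in vivo CSF"): (41_270_151, 35_340_075, 85.63),
}


def mapping_percent(total_reads, mapped_reads):
    """Percentage of sequenced reads that aligned to the reference genome."""
    return 100.0 * mapped_reads / total_reads


# Total number of reads across the six libraries.
total_reads = sum(total for total, _, _ in TABLE_1.values())

# Recompute the mapping rate of each sample from its raw counts.
recomputed = {
    sample: mapping_percent(total, mapped)
    for sample, (total, mapped, _) in TABLE_1.items()
}

# Samples whose recomputed mapping rate exceeds 85%.
above_85 = [sample for sample, pct in recomputed.items() if pct > 85.0]

if __name__ == "__main__":
    print(f"total reads: {total_reads:,}")
    for sample, pct in recomputed.items():
        print(sample, f"{pct:.2f}%")
```

Keeping the reported percentage next to the raw counts makes it easy to spot transcription errors in either column.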

We aligned all the sequence reads to the C. neoformans var. 
grubii H99 genome using Bowtie (23) and TopHat (24). For most 
samples, more than 85% of the reads mapped to the genome, and 
there were no major differences among the RNA obtained from 
the patients' yeast cells (in vivo CSF) and the in vitro cultures 
incubated in CSF or YPD (Table 1). We measured the expression 
levels in fragments per kilobase of exon model per million mapped
reads (FPKM) (25), and the gene expression level was defined as 
the sum of the FPKM values of its isoforms. Due to the high sen- 
sitivity of the RNA-seq technique, almost the entire gene set of the 
C. neoformans genome was detected as expressed (FPKM > 1) 
under all three conditions. Of the 6,976 genes predicted by the
H99 genome, 6,778 (97.16%), 6,647 (95.28%), and 6,809
(97.60%) genes were expressed in G0 under YPD, ex vivo CSF, and
in vivo conditions, respectively, and in HC1, 6,849 (98.18%),
6,620 (94.89%), and 6,778 (97.16%) genes were expressed under
the same three conditions. A total of 6,363 (91.21%) genes were
expressed in both strains under all conditions. In contrast, 207 or
190 genes were exclusively expressed in either G0 or HC1, respec-
tively, under one or more of the three conditions.

FIG 1 (A) Hierarchical cluster analysis of gene expression based on log ratio of normalized read count per gene. The cluster displays expression patterns for six
RNA-seq samples. The color key represents log2-normalized transformed counts. (B) Number of genes that are significantly up/downregulated (FDR < 0.2)
between two different conditions in both strains. Pairwise comparisons were made between three conditions: ex vivo CSF versus YPD, in vivo CSF versus YPD,
and in vivo CSF versus ex vivo CSF. The numerical designation indicates the number of genes that were up/downregulated in the first condition compared to that
in the second condition. See Table S2 in the supplemental material for a list of these genes.
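
The FPKM-based definition of an expressed gene used above (gene-level expression as the sum of isoform FPKM values, with FPKM > 1 counted as expressed) can be written out as a minimal sketch. The function names and the toy fragment counts below are hypothetical and serve only to show the arithmetic; they are not taken from the study's pipeline.

```python
def fpkm(fragments, length_bp, total_mapped_fragments):
    """Fragments per kilobase of exon model per million mapped fragments."""
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


def gene_fpkm(isoforms, total_mapped_fragments):
    """Gene-level expression: the sum of the FPKM values of the isoforms.

    `isoforms` is a list of (fragment_count, exon_model_length_bp) pairs.
    """
    return sum(
        fpkm(fragments, length_bp, total_mapped_fragments)
        for fragments, length_bp in isoforms
    )


def is_expressed(gene_level_fpkm, threshold=1.0):
    """A gene counts as expressed when its FPKM exceeds the threshold."""
    return gene_level_fpkm > threshold


if __name__ == "__main__":
    # Hypothetical gene with two isoforms in a library of 10 million fragments.
    toy_isoforms = [(500, 2000), (100, 1000)]
    value = gene_fpkm(toy_isoforms, 10_000_000)
    print(value, is_expressed(value))
```

Because FPKM divides by both transcript length and library size, the same threshold can be applied to libraries of very different depth, such as those listed in Table 1.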

Analysis and evaluation of differential gene expression. In
order to compare gene expression levels among different environ-
mental conditions, we generated the whole gene expression profile
from all the RNA-seq data using HTSeq-count
(http://www-huber.embl.de/users/anders/HTSeq) and normalized it using the DESeq
package (26) in R (27). Hierarchical clustering indicated that the
expression profiles of the two strains growing under the same
conditions were more similar to each other than the patterns of
transcription of each strain under different conditions (Fig. 1A).
This similarity is notable because the duration of CNS infection
with each strain was unknown but likely differed. In addition,
strains G0 and HC1 represent divergent molecular types of C. neo-
formans var. grubii, VNI and VNII, respectively (22, 28, 29). These
observations raise the possibility that in vivo transcrip-
tomes of additional strains might reveal more conserved in vivo
genetic signatures.
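
The normalization step mentioned above relies on DESeq, which scales each library by a size factor computed with the median-of-ratios method. The following pure-Python sketch illustrates that idea on a toy count matrix; it is not the R code used in the study, and the function names, the sample labels, and the counts are assumptions made for illustration.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    `counts` maps a sample name to a list of raw read counts, with genes in
    the same order for every sample. Genes with a zero count in any sample
    are skipped because their geometric mean is not defined on the log scale.
    """
    samples = list(counts)
    n_genes = len(counts[samples[0]])
    usable_genes = []
    log_geometric_means = []
    for gene in range(n_genes):
        values = [counts[sample][gene] for sample in samples]
        if all(value > 0 for value in values):
            usable_genes.append(gene)
            log_geometric_means.append(
                sum(math.log(value) for value in values) / len(values)
            )
    factors = {}
    for sample in samples:
        # Log ratio of each gene's count to its geometric mean across samples.
        log_ratios = [
            math.log(counts[sample][gene]) - log_mean
            for gene, log_mean in zip(usable_genes, log_geometric_means)
        ]
        factors[sample] = math.exp(median(log_ratios))
    return factors


def normalize(counts):
    """Divide each sample's raw counts by that sample's size factor."""
    factors = size_factors(counts)
    return {
        sample: [count / factors[sample] for count in counts[sample]]
        for sample in counts
    }


if __name__ == "__main__":
    # Hypothetical counts: sample "b" was sequenced twice as deeply as "a".
    toy = {"a": [10, 20, 30, 0], "b": [20, 40, 60, 5]}
    print(size_factors(toy))
    print(normalize(toy))
```

Using the median rather than the total read count keeps a handful of very highly expressed genes from dominating the scaling, which matters when libraries differ as much in depth as those in Table 1.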

Although serial in vivo CSF specimens were not available for 
our study, the similarity of gene expression patterns of the two 
strains under different conditions allowed us to regard the two 
strains under the same conditions as biological replicates. This 
approach allowed us to identify common transcriptional re- 
sponses that are relevant for both genotypes. A modified Fisher's
exact test from the DESeq package (26), with the data fit to a negative
binomial distribution, was used to identify the differentially ex-
pressed (DE) genes. We made pairwise comparisons between dif-
ferent conditions, and the numbers of DE genes are as follows 
(Fig. 1B): ex vivo CSF versus YPD, 129 (see Table S2A); in vivo
versus YPD, 45 (see Table S2B); in vivo versus ex vivo CSF, 256 (see 
Table S2C). These comparisons indicate that the transcriptional 
profiles of the in vivo CSF and YPD samples were actually more 
similar to each other than to ex vivo CSF samples. These results are 
also consistent with the hierarchical clustering of the transcrip-
tional profiles in the dendrogram display (Fig. 1A). This finding
might be explained by the potentially more active yeast cell growth 
in YPD and the host than is present during ex vivo CSF exposure. 
Gene ontology (GO) analysis was used to identify functional cat- 
egories overrepresented in the DE genes. Compared to yeast cells 
that were incubated ex vivo in CSF, yeasts from in vivo CSF up- 
regulated the expression of genes that were enriched in GO terms 
that related to cell metabolism, such as cellular biosynthetic pro-
cesses (GO:0044249, P = 1e-12), gene expression (GO:0010467,
P = 5.4e-19), and structural constituents of ribosome
(GO:0003735, P < 1e-30). The increased expression of ribosomal pro-
tein genes indicated that the yeast cells were more biosynthetically 
active within the human brain and subarachnoid space, where 
they were confronted by host inflammatory cells and processes. 
Compared to nutrient-rich YPD, human (in vivo and ex vivo) CSF
is a nutritionally depleted environment. We hypothesized that the
genes upregulated in the CSF (in vivo or ex vivo) might be more
related to biological survival and/or fitness of C. neoformans in the 
human subarachnoid space. Indeed, 20 genes were identified to be 
significantly upregulated in in vivo and ex vivo CSF conditions
compared with that in YPD (Fig. 2A; see also Table S3A in the
supplemental material). As predicted, several of these identified
genes have been reported as putative virulence or fitness genes in
C. neoformans, such as CFO1 (30), ENA1 (31, 32), and RIM101
(33). However, six of the 20 genes had no functional annotation;
two of them (CNAG_00456 and CNAG_05159) were identified as
the putative target genes of Gat201 (3), which is known as a regu-
lator of virulence (34).

FIG 2 Heatmaps showing the expression data of 20 genes that were significantly upregulated in ex vivo CSF and in vivo CSF compared to YPD in both strains
(A) and the expression data of six genes that were significantly upregulated in in vivo CSF compared to in ex vivo CSF and YPD in both strains (B). (C) Hierarchical
cluster analysis of the 100 most divergently expressed genes between G0 and HC1.
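
Differentially expressed genes were called at a false discovery rate (FDR) below 0.2 (Fig. 1B). The text does not state which multiple-testing correction produced the FDR values, so the sketch below assumes the Benjamini-Hochberg procedure purely for illustration; the function names and the example P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order.

    Assumption: this correction is used here only as an example of how an
    FDR threshold can be applied; the source does not name the procedure.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest so that the adjusted
    # values never decrease as the raw P values increase.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def significant(p_values, fdr=0.2):
    """Flag tests whose adjusted P value falls below the FDR threshold."""
    return [q < fdr for q in benjamini_hochberg(p_values)]


if __name__ == "__main__":
    toy_p_values = [0.01, 0.04, 0.03, 0.5]  # hypothetical per-gene P values
    print(benjamini_hochberg(toy_p_values))
    print(significant(toy_p_values))
```

A threshold of 0.2 is permissive: among the genes called at that level, up to about one in five may be false positives, which is a reasonable trade-off when only two samples per condition are available.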

To analyze the specificity of the transcriptional responses of
C. neoformans in the human body, we identified six genes that were significantly
upregulated in the in vivo CSF compared to the other two condi-
tions (Fig. 2B; see Table S3B in the supplemental material).
Among these genes, the sulfiredoxin gene (SRX1) has been re-
ported to have a critical role in the resistance of yeasts and higher
eukaryotes to oxidative stress (35, 36). In addition, the high ex-
pression of SRX1 might be critical for survival of the yeasts in the
presence of monocytes/macrophages in CSF or microglial cells in
brain parenchyma. Furthermore, another upregulated gene, SIT1,
has been reported to be essential for growth, melanin formation,
and cell wall density of C. neoformans under low-iron conditions
(37) and for invasion of epithelial cells by Candida albicans (38).
The expression of SIT1 as well as CFO1 indicates the potential
importance of iron at this site.

To verify the potential relevance of genes that were shown by 
RNA-seq to be upregulated in CSF, four of the DE genes were 
selected for quantification of their expression by real-time reverse 
transcription-PCR (RT-PCR). Due to the low quantity of in vivo 
RNA sample of the G0 strain, cDNAs of five different samples were
used as the templates. Transcription of ACT1 was used to confirm
and normalize the concentration of mRNA among different sam-
ples. The comparisons between RNA-seq and RT-PCR of these
genes are shown in Fig. S2 in the supplemental material. Although
some differences were shown in the HC1 YPD sample, most of the
data were consistent between the two.

Strain-specific differentially expressed genes. As strains G0
and HC1 belonged to different genetic MLST subpopulations (see
Table S1 and Fig. S1 in the supplemental material), we investi-
gated their divergently expressed genes. A dissimilarity score was
used to estimate the diversity of gene expression between the two
strains. The 100 most divergently expressed genes were identified
based on this dissimilarity score (Fig. 2C; see also Table S4). Based
on the expression patterns, these genes can be organized into two
groups: 69 genes (group 1) are expressed significantly more in G0
than in HC1 under all conditions, and 31 genes (group 2) are
expressed significantly more in HC1 than in G0 under all condi-
tions. Gene ontology analysis revealed that these 100 most diver-
gently expressed genes were enriched for transporters
(P = 1.54e-4).

Substantial genomic variation exists among G0, HC1, and
H99. RNA-seq technology was developed primarily to analyze
global gene expression. However, the high coverage and good
quality of the data provided us with an efficient way to assess the
genetic diversity among clinical strains. Compared to the standard
H99 clinical isolate from the United States, 50,155 single-
nucleotide variants (SNVs) and 156,880 SNVs were identified in
G0 and HC1, respectively (see Fig. S3 and Table S5 in the supple-
mental material). Strains H99 and G0 are prevalent VNI molecu-
lar types with respective MLST genotypes of M1b and M3b, and
HC1 is a global VNII strain with the M7c genotype (see Table S1).
In G0, 21,059 SNVs were dispersed in 5,185 genes, and in HC1,
104,027 SNVs were dispersed in 6,728 genes. The SNVs in the
exon regions were further classified according to the open reading
frame (ORF). In G0, 19,997 SNVs were detected in coding regions,
of which 8,626 were nonsynonymous. In HC1, 99,542 SNVs were
detected in coding regions, and 35,674 were nonsynonymous. The
ORF regions contained 70 SNVs in G0 and 130 SNVs in HC1
that change an amino acid codon to a stop codon or vice versa.
Of these, 32 SNVs in G0 (see Table S6A) and 45
SNVs in HC1 (see Table S6B) were regarded as readthrough SNVs,
and the others were regarded as nonsense SNVs (see Table S6C
and D). Due to changes in the stop codon, these SNVs may affect pro-
tein translation and/or stability and cause phenotypic changes be-
tween the strains (39, 40). For example, one of the readthrough
mutations has specifically occurred in G0 OGG1 (CNAG_03795),
which encodes a DNA glycosylase that is putatively involved in the
repair of oxidative DNA damage (41). Therefore, in a phenotypic
screen, we tested whether this gene variation had a phenotypic
consequence on the three yeast strains. A higher level of resistance
to H2O2 was observed in G0 than in H99 and HC1 (see Fig. S4).
Although further studies will be required to determine whether
this phenotypic change was specifically caused by this single nu-
cleotide mutation, this observation underscores the ability to
identify areas of potential genomic differences that translate into
specific and sometimes subtle differences in cryptococcal strain
phenotypes.

FIG 3 The proportions of the novel genes of G0 (A) and HC1 (B) that mapped to different species. (C) GO classification of novel genes of G0 and HC1 compared
to those of H99. Percentages of genes annotated with certain GO terms compared to total number of annotated genes are shown in the figure. A hypergeometric
test was used to test the enrichment of G0 and HC1 novel genes compared with H99 gene composition (*, P < 0.05; **, P < 0.01).
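
The SNV classes discussed above (synonymous, nonsynonymous, nonsense, and readthrough) follow directly from comparing the reference and variant codons. The sketch below shows one way to make that comparison with the standard genetic code; the function name `classify_snv` and the label "missense" for amino acid substitutions are choices made here for illustration, and the sketch ignores introns, strand, and reading-frame lookup, which a real annotation pipeline would have to handle.

```python
# Standard genetic code, built in the conventional TCAG order; "*" is a stop.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}


def classify_snv(ref_codon, position, alt_base):
    """Classify a single-nucleotide change inside one codon.

    `position` is 0, 1, or 2 within the codon. Returns "synonymous",
    "readthrough" (stop codon lost), "nonsense" (stop codon gained),
    or "missense" (one amino acid replaced by another).
    """
    alt_codon = ref_codon[:position] + alt_base + ref_codon[position + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if ref_aa == "*":
        return "readthrough"
    if alt_aa == "*":
        return "nonsense"
    return "missense"


if __name__ == "__main__":
    print(classify_snv("TAA", 0, "C"))  # stop codon changed to a glutamine codon
    print(classify_snv("TGG", 2, "A"))  # tryptophan codon changed to a stop codon
```

In this scheme the readthrough and nonsense classes are the two directions of the amino acid/stop codon changes counted in the text, while missense changes make up the remainder of the nonsynonymous SNVs.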

Gene gain and loss during evolution may also be critical for the
functional differentiation between species and strains (42, 43).
Therefore, to identify novel genes in both strains compared with a
reference genome, we performed de novo assembly of the RNA-
seq data using Trinity (44). A total of 18,260 and 24,664 contigs
were assembled for G0 and HC1, respectively, which belonged to
8,996 and 10,268 unigenes (the nonredundant set of the contigs).
We first aligned all the contigs against the H99 genome and mito-
chondrial DNA sequences, but 53 unigenes in G0 and 171 unige-
nes of the more divergent HC1 could not be aligned to them. We
then used BLAST to identify the homologs of these unaligned
unigenes in the NCBI nonredundant nucleotide (nt) and nonre-
dundant protein (nr) databases. Forty-four unigenes in G0 and
118 unigenes in HC1 had homologs in the
four fully sequenced Cryptococcus genomes (Fig. 3A and B). To
investigate the putative functions of these novel genes, we used
BLAST2GO (45) to annotate them. Nineteen genes in G0 and 42
genes in HC1 were assigned corresponding gene ontology terms,
and they were enriched in functions related to transport, localiza-
tion, and membrane constitution in both G0 and HC1 compared
with the standard whole genome composition of H99 (Fig. 3C).
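
The Fig. 3 legend states that a hypergeometric test was used to assess the enrichment of GO terms among the novel genes. A minimal sketch of such a one-sided test is given below; the function name and all counts in the example are hypothetical, since the per-term counts are not given in the text.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """One-sided hypergeometric P value, P(X >= hits).

    population: number of annotated genes in the background genome
    annotated:  background genes carrying the GO term of interest
    sample:     number of annotated genes in the test set (e.g. novel genes)
    hits:       test-set genes carrying the GO term
    """
    upper = min(annotated, sample)
    total_ways = comb(population, sample)
    tail_ways = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail_ways / total_ways


if __name__ == "__main__":
    # Hypothetical counts: 5,000 annotated background genes, 300 of them
    # transporters, and 8 transporters among 40 annotated novel genes.
    print(hypergeom_enrichment_p(5000, 300, 40, 8))
```

Exact integer binomial coefficients avoid the rounding problems that arise when very small tail probabilities are computed with floating-point factorials.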

DISCUSSION 

In this study, we generated genome-wide transcriptional profiles 
of C. neoformans var. grubii from two untreated AIDS patients. To 
our knowledge, this is the first report of RNA-seq data generated 
from yeast cells taken directly from human CSF. When we aligned 
the sequencing data to the reference genome of strain H99, the in 
vivo data from strain G0 received an extremely high ratio of hits
(>95%), which indicated that RNA recovered from the in vivo 
samples was of high quality with very limited contamination. 
Analysis of in vivo transcription is especially challenging, because 
each in vivo sample is unique and cannot be duplicated, which is 
not ideal for applying classical strategies for quantitative analysis 
of transcriptional profiles (46, 47). To address this situation, we 
treated the samples obtained from different patients as replicates
in the analyses. The results demonstrated good reproducibility of 
the gene expression patterns between the two different clinical 
strains. Using this approach, we were able to compare and identify 
genes whose expression is associated with in vivo survival and 
growth. This strategy can be used in future analyses of in vivo 
samples. 

To identify genes whose expression is associated with in vivo 
growth, we assessed transcriptional profiles of the two strains un- 
der two defined in vitro conditions, YPD and ex vivo CSF, and 
compared them with expression in the in vivo environment. We 
hypothesized that because of low nutrients in human CSF, the in 
vivo gene expression profile would be more similar to that ob- 
served in ex vivo CSF than YPD, which is nutritionally replete. This 
hypothesis was supported by our previous data that the ability of 
C. neoformans mutants to survive in vitro in human CSF was 
closely correlated with their production of disease in a rabbit men- 
ingitis model (48). Contrary to our expectations, in this study we 
observed that the in vivo gene expression profiles of C. neoformans 
strains in human CSF were generally more similar to those in YPD 
than to ex vivo CSF (Fig. 1). Gene ontology enrichment analysis 
revealed that the downregulated genes in ex vivo CSF samples 
compared to those in in vivo human samples were significantly 
enriched in metabolic and cellular processes. These results suggest 
that the yeast cells are inhibited in their basic metabolic machinery 
in the ex vivo CSF; however, inside the host (in vivo CSF), yeast
cells maintain an active metabolic gene network and actively pro- 
liferate. The results are also consistent with the observations that 
during human disease, the number of yeast CFUs can increase to 
more than 1 million CFU/ml in an HIV-infected patient (49), and 
in the immunosuppressed rabbit model, abundant yeast growth 
occurs in the subarachnoid space (50). There are several nonex- 
clusive explanations of why C. neoformans growth measured by 
viable quantitative yeast counts is inhibited in ex vivo CSF but 
active in vivo. First, in humans, CSF is constantly being renewed, 
and therefore in vivo CSF has a much higher concentration of 
nutrients than ex vivo CSF. Second, ex vivo CSF has a much more 
alkaline pH compared to that of in vivo CSF, which can affect 
transcription. Third, C. neoformans cells colonize human CNS 
after going through several stages of infection, which may extend 
for months or even years. This slow progression through the in- 
fection process can cause genetic and epigenetic adaptations and 
subsequent changes in the gene expression patterns, which are 
detectable in yeast cells obtained directly from humans but lost in
vitro because the yeast cells exposed to ex vivo CSF were propa- 
gated in culture prior to CSF exposure. 

Although the growth state of yeast cells in ex vivo CSF and 
human subarachnoid space (in vivo) may not be the same, there
are stresses from this specific fluid environment that are similar 
and unique. Therefore, the 20 genes that we identified as signifi- 
cantly upregulated in the two CSF conditions compared to YPD 
might specifically represent the response of C. neoformans to cer- 
tain CSF stresses or signals. Based on the annotation information 
in FungiDB (51), most of these genes are categorized into three 
groups: catalytic activity-related genes (6/20), transporters (5/20), 
and genes of unknown function (6/20). Two of these genes have 
been reported as necessary survival genes for C. neoformans. We 
previously demonstrated that the ATPase transporter gene ENA1
is essential for survival of this yeast in CSF (48). Jung et al. showed
that the ferroxidase, CFO1, is required for the utilization of the
biological protein, transferrin, which is an important iron source
for C. neoformans during infection (30). We did not observe any 
evidence for differential regulation of any of the classical virulence 
factors (capsule, melanin, high-temperature growth, urease, 
phospholipase) at this advanced clinical state of infection, most 
likely because these classical virulence factors are expressed during 
the early stages of infection in the lungs and/or during dissemina- 
tion. Conversely, our data suggest that basic metabolic pathways 
and stress response genes are essential for survival and successful 
propagation within the subarachnoid space. 

As we investigate in vivo gene expression data between humans 
and animal models and compare those with the data obtained for 
other fungal infections, it is possible to recognize conserved pat- 
terns of gene regulation. For example, the isocitrate lyase gene 
(ICL1) and the glyoxylate pathway are upregulated in both human
and experimental animal models of cryptococcal meningitis. In 
addition, recent work by Cheng et al. demonstrated the upregula-
tion of RIM101 in peritoneal Candida infections in mice (52),
which is similar to our observation of CNS infection with Crypto- 
coccus in humans. With the analyses of further cases, we expect to 
find common conserved pathways that are characteristic for de- 
velopment of the disease. 

To investigate the six differentially regulated genes with un- 
known functions, we searched the annotated fungal genomes 
available in FungiDB and found that four of them (CNAG_02118,
CNAG_04837, CNAG_05632, and CNAG_06493) are restricted 
to Cryptococcus lineage. Such lineage-specific genes are often im- 
portant for species evolution. For example, it has been demon- 
strated that in primates, lineage-specific genes are essential for 
human brain evolution (53). We hypothesize that in Cryptococcus, 
lineage-specific genes may be important for adaptation of Crypto-
coccus in its survival within certain human body sites. More stud-
ies are required to test this hypothesis, and we have now begun to 
identify the genes of interest. 

Transporters are groups of genes that play important roles in 
yeast biology. In our study, transporter genes were frequently 
identified in many different analyses. Based on our results, we 
divided the transporter genes into two groups. One group in- 
cludes the transporters that are conserved among different strains 
or even distantly related species, and they are essential for Crypto- 
coccus. Several of these transporters have been identified and stud- 
ied previously (54-57). Three genes from this group of transport-
ers with previously defined function, CFO1, ENA1, and SIT1, were
significantly upregulated in both in vivo and ex vivo CSF compared 
to in YPD. These genes are important for virulence, drug resis- 
tance, starvation response, intracellular survival, and other basic 
functions in Cryptococcus (30, 31, 37, 48, 58). The second group of 
transporters includes rapidly evolving and variable genes that may 
not be particularly essential for the yeast's pathobiology. These 
transporters have different expression patterns or even different 
genetic compositions among genetically related strains. In our 
study, these transporter genes were enriched among the most di- 
vergently expressed and novel genes. Most of these genes have 
been annotated as carbohydrate transporters, such as a sugar 
transporter, monosaccharide transporter, galactose transporter, 
and hexose transporter. Although the exact function of these 
transporters is unknown, they might be involved in the biosynthe- 
sis of the polysaccharide capsule of Cryptococcus, which is one of 
its main virulence factors (59). In addition to carbohydrate
transporters, we also observed several novel genes that belonged to
the major facilitator superfamily (MFS) transporters and ATP-binding
cassette (ABC) transporters, which have been shown to
play a role in multidrug resistance of fungi (60). We hypothesize 
that this group of transporters may contribute to variable strain- 
specific properties and even explain the differences in disease 
manifestations and treatment outcomes between strains. Indeed, 
when the two strains were compared for the most divergently 
expressed genes, transporter genes were commonly identified, and 
these findings support the hypothesis that the microevolution of 
these strain-specific properties frequently involves transporter 
functions. However, more studies are necessary to evaluate the 
potential role of the second group of transporters in the pathogen- 
esis of C. neoformans and how genetic changes in these transport- 
ers influence a strain's unique phenotype. 

This first analysis of cryptococcal transcriptomes during infec- 
tion in the CNS has demonstrated a series of important principles. 
First, the transcriptomes of cryptococcal cells from the human 
host can be captured and analyzed. Second, despite genetic differ- 
ences between strains and the duration of their infection, the tran- 
scriptomes from these strains are remarkably similar and suggest 
that a specific consensus pattern of gene expression may be asso- 
ciated with CNS infection. Third, C. neoformans is metabolically 
active in the human CSF. Fourth, specific genes with known vir- 
ulence or survival properties are identified using in vivo transcrip- 
tion profiling, which validates this method. Moreover, genes and 
pathways essential for survival of yeasts in the human body are not 
limited to classical virulence pathways or phenotypes. Fifth, 
through the combination of RNA-seq and genome analysis, iden- 
tification of SNVs and other genetic diversities may predict differ- 
ences in gene expression and/or function in individual strains and 
identify the potential impact of these mutations on yeast micro- 
evolution. Additional strategies are needed to determine the func- 
tions of nonconserved genes that lack homology to annotated se- 
quences, as they may be essential for the pathogenicity of 
C. neoformans. Comparative analyses of transcriptomes may identify
the potential importance of these unrecognized genes. This study is
just the beginning, but it illustrates the potential to characterize
fungal transcriptomes and then relate them to the progressive states
of disease. Such transcriptomes may provide insights into how a strain
produces disease and whether some strains produce a genetic signature
that could even predict infection outcome or response to therapy.

MATERIALS AND METHODS 

DNA manipulations and phylogenetic analyses. Genomic DNA was isolated
using a MasterPure yeast DNA purification kit (Epicentre
Biotechnologies, Madison, WI). As previously described, mating types
were determined by a PCR-based method using primers for the a or α
allele of the STE20 gene (61). For MLST, eight previously described
loci (CAP59, GPD1, IGS1, LAC1, PLB1, SOD1, URA5, and TEF1) were used
for PCR amplification (22, 62). Amplicons were sequenced by Sanger
sequencing, and the reads were edited manually. To identify the
genotypes of GO and HC1, their MLST sequences were compared with a
global set of reference strains of C. neoformans (see Table S1). The
sequences of all the loci were aligned with ClustalW (63). The maximum
parsimony (MP) tree was identified with heuristic searches based on
500 random sequence additions using PAUP (64) and was rooted using
C. neoformans strain JEC21 (serotype D).

RNA-seq sample preparation. Yeast cells (approximately 10* to
10^ CFUs) were pelleted in CSF in a microfuge tube containing 20 to
30 μl of 1-mm glass beads, and the pellet was stored at −80°C. The
frozen pellet was lyophilized and vortexed to a powder, and RNA was
extracted using a modified Trizol/Qiagen procedure as follows. Yeast
cells were lysed in 700 μl Trizol and incubated for 5 min at room
temperature. After 140 μl of chloroform was added, the tube was shaken
for 20 s and incubated at room temperature for 3 min. The sample was
centrifuged at 10,000 rpm for 15 min at room temperature. The aqueous
phase was separated, mixed with an equal volume of 80% ethanol,
immediately applied to a Qiagen RNeasy minicolumn (Qiagen; catalog
number 74014), and centrifuged at 13,000 rpm for 1 min, and RNA was
isolated according to the manufacturer's protocol. The yield of total
RNA ranged between 1 and 25 mg/ml.

The CSF samples were part of the Duke IRB-approved database and
Specimen Repository for Infectious Disease Related Studies (PR0005314)
in which patients are deidentified and clinical information is
limited. The HC1 strain came from a patient in the United States, and
strain GO was from a patient in Uganda. YPD broth (1% yeast extract,
1% Bacto peptone, 2% dextrose) and sterile human CSF (pooled from 10
to 20 individuals) were prepared as previously described (48) and used
for in vitro or ex vivo incubation of the strains. The strains were
grown in YPD broth for 16 h at 37°C and then harvested. For ex vivo
exposure, yeast cells in stationary phase after overnight culture in
YPD at 37°C were exposed to human CSF for 9 h, during which the CSF
was replenished every 3 h, and the cells were then harvested. All
harvested cells were snap-frozen and lyophilized for total RNA
isolation.

RNA sequencing and quantitative analysis. Total RNAs from the two
strains (HC1 and GO) under the various conditions were extracted using
Trizol (Invitrogen) according to the manufacturer's instructions. The
mRNA libraries for RNA-seq analysis were prepared using a TruSeq RNA
sample preparation kit (Illumina, San Diego, CA). The cDNA libraries
were sequenced on the Illumina GAII and the Illumina HiSeq 2000
(Illumina, San Diego, CA) instruments. The C. neoformans var. grubii
H99 genome with annotations (2012 release) was downloaded from the
Broad Institute (http://www.broadinstitute.org) and used as a
reference. Sequencing reads of each sample were mapped to the
reference genome using TopHat 2.0.0 (24). Subsequently, we used
HTSeq-count (http://www-huber.embl.de/users/anders/HTSeq) to convert
the mapped reads to read counts per gene. In the quantitative
analysis, the two isolates exposed to the same conditions were treated
as biological duplicates. We evaluated the expression differences
using a test based on a negative binomial distribution, as implemented
in the R package DESeq (26). Three comparisons were made among the
different conditions: (i) in vivo human CSF versus YPD; (ii) in vivo
CSF versus ex vivo CSF exposed; (iii) ex vivo CSF exposed versus YPD.
The false discovery rate (FDR) was calculated by the "p.adjust"
function in R using the Benjamini and Hochberg (65) method and
controlled at 20%. All selected FDR-adjusted P values corresponded to
raw P values below 0.01. The number of fragments per kilobase of exon
model per million mapped reads (FPKM) was calculated according to
Cufflinks (25). A dissimilarity score was defined as follows:

$$\text{dissimilarity score} = 1 - \sum_i \frac{2\,P_i(A)\,P_i(B)}{P_i(A) + P_i(B)}$$

where $P_i(A)$ and $P_i(B)$ represent the normalized expression values
of gene $i$ in strain A and strain B, respectively, under the same
condition.
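The two numerical steps described above, Benjamini-Hochberg adjustment of P values and the dissimilarity score, can be illustrated with a minimal Python sketch. It is not the authors' code (the study used R's "p.adjust" and DESeq). The function names are hypothetical, and the sketch assumes that "normalized expression value" means each gene's share of the total expression in a sample, so that the score falls between 0 and 1.

```python
from typing import List, Sequence


def bh_adjust(pvalues: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted P values, in the input order."""
    n = len(pvalues)
    # Walk from the largest P value to the smallest, enforcing monotonicity.
    order = sorted(range(n), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = n - position  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def dissimilarity_score(expr_a: Sequence[float], expr_b: Sequence[float]) -> float:
    """1 - sum_i 2*P_i(A)*P_i(B) / (P_i(A) + P_i(B)).

    Assumption: values are normalized to proportions within each sample.
    """
    if len(expr_a) != len(expr_b):
        raise ValueError("both strains need one value per gene")
    total_a, total_b = sum(expr_a), sum(expr_b)
    if total_a <= 0 or total_b <= 0:
        raise ValueError("each profile needs a positive total")
    similarity = 0.0
    for a, b in zip(expr_a, expr_b):
        p_a, p_b = a / total_a, b / total_b
        if p_a + p_b > 0:  # genes silent in both strains contribute nothing
            similarity += 2 * p_a * p_b / (p_a + p_b)
    return 1.0 - similarity


if __name__ == "__main__":
    # Hypothetical P values and expression profiles, for illustration only.
    print(bh_adjust([0.01, 0.04, 0.03, 0.005]))
    print(dissimilarity_score([1, 1], [3, 1]))
```

Under this normalization, identical profiles score 0 and profiles with no shared expressed genes score 1.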

Quantitative real-time RT-PCR validation of RNA-seq data. Total RNA of
each sample was treated with DNase (Turbo DNA-free kit; Ambion) to
avoid genomic DNA contamination. Reverse transcription was performed
using the RETROscript kit (Ambion). Four genes were selected for
validation of RNA-seq results using RT-PCR. Primer pairs were designed
to span exon-exon junctions using Primer3, and ACT1 was used as an
internal control. The primer sequences are as follows (5' to 3'):
ACT1, CGAGAGTGTCGGCATTTAGGA (forward) and GAGGAAGATGGATAGGGAGGAT
(reverse); CNAG_05431, AAGCGCCGTGAGAGAGGTAG (forward) and
GGAGGTCGAAGAGAGTGGA (reverse); CNAG_06493, AAGAGAGGGGATCGAAGGTT
(forward) and GTTAGAGAATGGTGCGGTCG (reverse); CNAG_00654,
TGTCAGAACCAGGATGAGGG (forward) and CGGGCGTACGTTGCGTAATG (reverse);
CNAG_00815, AGGTGGGATGGATGGGTTTT (forward) and TGGAGGTTGCGGTCACAATA
(reverse). The amplifications were conducted in a total volume of
20 μl, containing 1× SYBR green (iTaq universal SYBR green supermix;
Bio-Rad), 300 nmol/liter of each primer, and 1 μl of diluted cDNA. The
amplification was conducted as follows: 5 min at 95°C, followed by 35
cycles consisting of 30 s at 95°C, 30 s at 54°C, and 30 s at 72°C.
Finally, melting curve analysis was performed from 60°C to 95°C, with
increments of 0.5°C per 10 s. Amplification, melting curve analysis,
and detection were conducted with the MyiQ single-color, real-time PCR
detection system (Bio-Rad).
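The text does not state how relative expression was derived from the RT-PCR measurements. One commonly used option with a single internal control such as ACT1 is the 2^-ΔΔCt method. The sketch below illustrates it under that assumption and further assumes roughly 100% amplification efficiency. The function name and the Ct values are hypothetical.

```python
import math


def relative_expression(ct_target_test: float, ct_ref_test: float,
                        ct_target_ctrl: float, ct_ref_ctrl: float) -> float:
    """Fold change of a target gene (test vs. control condition) by 2^-ddCt.

    Each Ct is first normalized to the reference gene in its own sample;
    the difference of those differences is then turned into a fold change.
    """
    delta_test = ct_target_test - ct_ref_test
    delta_ctrl = ct_target_ctrl - ct_ref_ctrl
    return 2.0 ** -(delta_test - delta_ctrl)


if __name__ == "__main__":
    # Hypothetical Ct values: target gene and reference gene in a CSF
    # sample (test) and in a YPD sample (control).
    fold = relative_expression(22.0, 18.0, 25.0, 18.0)
    print(f"fold change = {fold:.1f}, log2 = {math.log2(fold):.1f}")
```

Reporting the log2 fold change makes the result directly comparable with log2 ratios derived from RNA-seq counts.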

RNA-seq variant calling. BAM files that were generated by TopHat were
used as inputs. The Genome Analysis Toolkit (GATK version 2.4.9) (66)
was used to perform variant calling. To resolve the incompatibility in
mapping quality scores between TopHat and GATK, all mapping qualities
of 255 assigned by TopHat were reassigned to 60 for GATK. The filter
used in GATK was "-stand_call_conf 50 -stand_emit_conf 10 -dcov 500."
All the SNVs were annotated using VCFannotator
(http://vcfannotator.sourceforge.net) relative to the H99 genome.
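The mapping-quality reassignment described above can be illustrated on plain-text SAM records. This is a minimal sketch and not the procedure the authors ran, which the text does not specify. The function name is hypothetical, and it assumes tab-delimited SAM lines in which the fifth column holds MAPQ.

```python
def remap_mapq(sam_line: str, old: int = 255, new: int = 60) -> str:
    """Return the SAM line with MAPQ `old` replaced by `new`.

    Header lines (starting with '@') and records with any other MAPQ
    are returned unchanged.
    """
    if sam_line.startswith("@"):
        return sam_line
    newline = "\n" if sam_line.endswith("\n") else ""
    fields = sam_line.rstrip("\n").split("\t")
    # A SAM alignment record has at least 11 mandatory columns.
    if len(fields) >= 11 and fields[4] == str(old):
        fields[4] = str(new)
    return "\t".join(fields) + newline


if __name__ == "__main__":
    # Hypothetical record for illustration.
    record = "read1\t0\tchr1\t100\t255\t4M\t*\t0\t0\tACGT\tIIII\n"
    print(remap_mapq(record), end="")
```

In practice the same transformation would be applied while streaming a BAM file through a SAM-aware tool, so that the file remains sorted and indexed for the variant caller.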

Detection and characterization of novel genes. To obtain high-quality
assemblies for the transcriptomes, we used only 100-bp paired-end
reads for de novo assembly. Reads in which more than 10% of the bases
had phred quality scores below 20 were removed using the FASTX toolkit
(http://hannonlab.cshl.edu/fastx_toolkit). De novo assembly of the
filtered reads was performed with Trinity (44). To detect the unigenes
that either do not exist in or were highly divergent from the
reference genome, we aligned all the contigs to the C. neoformans var.
grubii H99 genome and the mitochondrial DNA sequence using BLAT (67).
A novel gene was defined as one for which none of the contigs could be
aligned to the reference sequences. All the unigenes that met this
criterion were aligned to the NCBI nucleotide (nt) and nonredundant
protein (nr) databases using BLAST. All the BLAST hits with E values
less than 1e-5 were kept for further analysis.
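The two thresholds in this procedure, the read-quality filter and the BLAST E-value cutoff, can be expressed as a short Python sketch. It is an illustration of the stated criteria, not the FASTX or BLAST implementation. The function names and example hits are hypothetical, and the quality strings are assumed to use the Phred+33 encoding (older Illumina data may use Phred+64, in which case `offset` would change).

```python
from typing import List, Tuple


def keep_read(quality: str, min_phred: int = 20,
              max_low_fraction: float = 0.10, offset: int = 33) -> bool:
    """Keep a read unless more than 10% of its bases fall below Phred 20."""
    if not quality:
        return False
    low = sum(1 for ch in quality if ord(ch) - offset < min_phred)
    return low / len(quality) <= max_low_fraction


def significant_hits(hits: List[Tuple[str, float]],
                     max_evalue: float = 1e-5) -> List[Tuple[str, float]]:
    """Keep (subject_id, e_value) pairs with an E value below the cutoff."""
    return [hit for hit in hits if hit[1] < max_evalue]


if __name__ == "__main__":
    good = "I" * 90 + "#" * 10   # exactly 10% low-quality bases: kept
    bad = "I" * 89 + "#" * 11    # 11% low-quality bases: removed
    print(keep_read(good), keep_read(bad))
    # Hypothetical BLAST hits for illustration.
    print(significant_hits([("hit_a", 3e-20), ("hit_b", 0.2)]))
```

The comparison is written so that a read with exactly 10% low-quality bases is retained, matching the wording "more than 10%".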

RNA-seq data accession number. RNA-seq data have been made publicly
available at GEO (http://www.ncbi.nlm.nih.gov/geo) under accession
number GSE51573.